Hidden-Mode Markov Decision Processes for Nonstationary Sequential Decision Making
Authors
Abstract
Samuel P. M. Choi, Dit-Yan Yeung, and Nevin L. Zhang. Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. fpmchoi,dyyeung,[email protected]
Similar resources
Solving Hidden-Semi-Markov-Mode Markov Decision Problems
Hidden-Mode Markov Decision Processes (HM-MDPs) were proposed to represent sequential decision-making problems in non-stationary environments that evolve according to a Markov chain. We introduce in this paper Hidden-Semi-Markov-Mode Markov Decision Processes (HS3MDPs), a generalization of HM-MDPs to the more realistic case of non-stationary environments evolving according to a semi-Markov chai...
Solving Hidden-Mode Markov Decision Problems
Hidden-Mode Markov decision processes (HM-MDPs) are a novel mathematical framework for a subclass of nonstationary reinforcement learning problems where environment dynamics change over time according to a Markov process. HM-MDPs are a special case of partially observable Markov decision processes (POMDPs), and therefore nonstationary problems of this type can in principle be addressed indirect...
Hidden-Mode Markov Decision Processes
Samuel P. M. Choi, Dit-Yan Yeung, Nevin L. Zhang ([email protected], [email protected], [email protected]). Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. Abstract: Traditional reinforcement learning (RL) assumes that environment dynamics do not change over time (i.e., stationary). This assumption, however, is not realistic in many real-...
An Environment Model for Nonstationary Reinforcement Learning
Reinforcement learning in nonstationary environments is generally regarded as an important and yet difficult problem. This paper partially addresses the problem by formalizing a subclass of nonstationary environments. The environment model, called hidden-mode Markov decision process (HM-MDP), assumes that environmental changes are always confined to a small number of hidden modes. A mode basic...
A Heuristic Search Algorithm for Acting Optimally in Markov Decision Processes with Deterministic Hidden State
We propose a heuristic search algorithm for finding optimal policies in a new class of sequential decision making problems. This class extends Markov decision processes by a limited type of hidden state, paying tribute to the fact that many robotic problems indeed possess hidden state. The proposed search algorithm exploits the problem formulation to devise a fast bound-searching algorithm, whi...
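The HM-MDP abstracts listed above describe an environment whose dynamics depend on a hidden mode that itself evolves as a Markov chain, which makes the model a special case of a POMDP. As a rough illustration only (not the authors' algorithm), the sketch below tracks a belief over hidden modes from observed state transitions. The function name, argument layout, and toy numbers are all hypothetical, and it assumes the observed transition is generated by the current mode before the mode takes its next step.

```python
def mode_belief_update(belief, mode_trans, state_trans, s, a, s_next):
    """One Bayes-filter step over hidden modes (illustrative sketch).

    belief:      list of length M, probability of each mode right now
    mode_trans:  M x M nested list, mode_trans[m][m2] = P(next mode m2 | mode m)
    state_trans: nested list indexed [m][s][a][s2] = P(s2 | s, a, mode m)
    Assumption: the observed transition (s, a, s_next) was produced by the
    current mode, and the mode changes afterwards.
    """
    n_modes = len(belief)
    # Weight each mode by how well it explains the observed transition.
    weighted = [belief[m] * state_trans[m][s][a][s_next] for m in range(n_modes)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("observed transition is impossible under every mode")
    posterior = [w / total for w in weighted]
    # Advance the belief one step along the mode Markov chain.
    return [
        sum(posterior[m] * mode_trans[m][m2] for m in range(n_modes))
        for m2 in range(n_modes)
    ]


# Hypothetical toy problem: 2 modes, 2 states, 1 action.
mode_trans = [[0.9, 0.1],
              [0.1, 0.9]]
state_trans = [
    [[[0.1, 0.9]], [[0.1, 0.9]]],  # mode 0: usually moves to state 1
    [[[0.9, 0.1]], [[0.9, 0.1]]],  # mode 1: usually moves to state 0
]
new_belief = mode_belief_update([0.5, 0.5], mode_trans, state_trans, s=0, a=0, s_next=1)
print(new_belief)
```

Because the modes are few and only the mode is hidden, the belief is a short vector over modes rather than over the full state space, which is the structural advantage the abstracts attribute to HM-MDPs over general POMDPs.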
Journal title:
Volume, issue
Pages -
Publication date: 2001